Using Semi-Supervised Learning and Wikipedia to Train an Event Argument Extraction System
Abstract
The paper presents a methodology for training an event argument extraction system in a semi-supervised setting. We use Wikipedia and Wikidata to automatically obtain a small, noisily labeled dataset and a large unlabeled dataset. The dataset consists of clusters containing Wikipedia pages in multiple languages. The unlabeled data is iteratively labeled using semi-supervised learning combined with probabilistic soft logic, which infers the pseudo-label of each example from the predictions of the base learners. The proposed methodology is applied to Wikipedia pages about earthquakes and terrorist attacks in a cross-lingual setting. Our experiments show an improvement in the results when using the proposed methodology. The system achieves an F1-score of 0.79 when only the labeled dataset is used, and 0.84 when it is trained according to the proposed methodology with probabilistic soft logic.
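The iterative pseudo-labeling loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper infers pseudo-labels with probabilistic soft logic, whereas this sketch substitutes a plain average of the base learners' scores with a confidence threshold. All names (`centroid_learner`, `combine`, `self_train`, `threshold`, `max_rounds`) and the toy numeric features are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, ...]
Labeled = List[Tuple[Example, int]]


def centroid_learner(feature: int) -> Callable[[Labeled], Callable[[Example], float]]:
    """Hypothetical base learner that looks at a single feature.

    Assumes the labeled set contains at least one example of each class.
    """
    def fit(labeled: Labeled) -> Callable[[Example], float]:
        pos = [x[feature] for x, y in labeled if y == 1]
        neg = [x[feature] for x, y in labeled if y == 0]
        pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
        midpoint = (pos_mean + neg_mean) / 2
        sign = 1.0 if pos_mean >= neg_mean else -1.0

        def predict(x: Example) -> float:
            # Squash the signed distance from the midpoint into (0, 1).
            return 1.0 / (1.0 + math.exp(-sign * (x[feature] - midpoint)))

        return predict

    return fit


def combine(scores: Sequence[float]) -> float:
    # Assumption: a plain average stands in for the paper's
    # probabilistic soft logic inference over base-learner predictions.
    return sum(scores) / len(scores)


def self_train(labeled, unlabeled, learners, threshold=0.8, max_rounds=5):
    """Move confidently scored unlabeled examples into the labeled set, round by round."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        models = [fit(labeled) for fit in learners]  # retrain every round
        remaining, added = [], 0
        for x in pool:
            p = combine([model(x) for model in models])
            if p >= threshold:
                labeled.append((x, 1))
                added += 1
            elif p <= 1 - threshold:
                labeled.append((x, 0))
                added += 1
            else:
                remaining.append(x)  # too uncertain, keep it unlabeled
        pool = remaining
        if added == 0:
            break  # no confident pseudo-labels left
    return labeled, pool
```

Calling `self_train(seed, pool, [centroid_learner(0), centroid_learner(1)])` returns the enlarged labeled set together with the examples that stayed below the confidence threshold in every round.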
Similar resources
Minimally Supervised Event Argument Extraction using Universal Schema
The prediction of events and their participants is an important component of building a knowledge base automatically from text. Typically, the events of interest are domain-specific and not known in advance, and so it is often the case that little or no training data is available to learn the appropriate predictors. In this work, we propose a technique for distantly supervised event argument ex...
Employing Event Inference to Improve Semi-Supervised Chinese Event Extraction
Although semi-supervised model can extract the event mentions matching frequent event patterns, it suffers much from those event mentions, which match infrequent patterns or have no matching pattern. To solve this issue, this paper introduces various kinds of linguistic knowledge-driven event inference mechanisms to semi-supervised Chinese event extraction. These event inference mechanisms can ...
Self-Train LogitBoost for Semi-supervised Learning
Semi-supervised classification methods are based on the use of unlabeled data in combination with a smaller set of labeled examples, in order to increase the classification rate compared with the supervised methods, in which the total training is executed only by the usage of labeled data. In this work, a self-train Logitboost algorithm is presented. The self-train process improves the results ...
Semi-supervised Learning Using an Unsupervised Atlas
In many machine learning problems, high-dimensional datasets often lie on or near manifolds of locally low-rank. This knowledge can be exploited to avoid the “curse of dimensionality” when learning a classifier. Explicit manifold learning formulations such as lle are rarely used for this purpose, and instead classifiers may make use of methods such as local co-ordinate coding or auto-encoders t...
Relation Extraction Using Label Propagation Based Semi-Supervised Learning
Shortage of manually labeled data is an obstacle to supervised relation extraction methods. In this paper we investigate a graph based semi-supervised learning algorithm, a label propagation (LP) algorithm, for relation extraction. It represents labeled and unlabeled examples and their distances as the nodes and the weights of edges of a graph, and tries to obtain a labeling function to satisfy...
Journal
Journal title: Informatica
Year: 2022
ISSN: 0350-5596, 1854-3871
DOI: https://doi.org/10.31449/inf.v46i1.3577